Exploiting deep learning network in optical chirality tuning and manipulation of diffractive chiral metamaterials


 Deep-learning (DL) networks have emerged as an important prototyping technology for advances in big data analytics, intelligent systems, biochemistry, physics, and nanoscience. Here, we use a DL model whose key algorithm relies on a deep neural network to efficiently predict the circular dichroism (CD) response in higher-order diffracted beams of two-dimensional chiral metamaterials with different parameters. The traditional rigorous coupled wave analysis (RCWA) method is used to generate the data on which the DL network is trained to predict the chiroptical response. Notably, these T-like shaped chiral metamaterials all exhibit the strongest CD response in the third-order diffracted beams, whose intensities are the smallest, when comparing up to four diffraction orders. Our comprehensive results reveal that the DL network captures the complex and nonintuitive relations between T-like metamaterials with different chiral parameters (i.e., unit period, width, bridge length, and separation length) and their CD performances, with high accuracy and a computational speed that is four orders of magnitude faster than RCWA. The insights gained from this study may assist the application of DL networks in investigating optical chirality in low-dimensional metamaterials and in expediting the design and optimization of hyper-sensitive ultrathin devices and systems.


Introduction
Renewed research interest has focused on optical chirality, a property of structures whose shape cannot be superimposed on its mirror image [1,2], which gives rise to a plethora of intriguing phenomena [3]. The promising application prospects of optical chirality span chemistry [4], life science [5], pharmaceutical synthesis [6], spectroscopy [7], spintronics [8], quantum computing [9,10], and sensitive detection and imaging [11]. Circular dichroism (CD) spectroscopy, which measures the differential absorption between right- (RCP) and left-circularly polarized (LCP) light, is one of the most successful approaches to efficiently characterize the chiroptical response of chiral materials [12]. Notably, the enantiomers of chiral materials interact differently with LCP and RCP light, as determined by the structure handedness [13]. Although chirality is omnipresent in nature, the chiroptical response of natural materials is generally very weak because of the small electromagnetic interaction volume [14], which makes high-sensitivity detection difficult and hinders its future prospects. Thanks to recent progress in nanofabrication techniques, it is feasible to alter the chirality parameters of artificial chiral materials and equip them with stronger optical chirality than their natural counterparts [15]. One salient example is the chiral metamaterial [16], in which localized surface plasmon (LSP) resonances greatly boost the light-matter interaction and thus largely enhance the chiroptical responses [17,18].
Zilong Tao, Jun Zhang and Jie You: These authors contributed equally to this work.
When compared with the three-dimensional chiral metamaterials, the two-dimensional (2D) ones seem to be a better candidate for the exploration of optical chirality, considering their exceptional intrinsic properties that benefit the manufacturing of nano-devices requiring small optical losses, compact size, and high compatibility with the complementary metal-oxide-semiconductor (CMOS) foundries [19][20][21][22]. Equally important, the diffractive chiral metamaterials have also emerged as significant platforms to study optical chirality, whose CD responses at higher-order diffracted beams are usually far larger than the case of zeroth-order [3,23]. However, the complete investigation of diffractive metamaterials with plenty of geometry parameters is rarely found in literature [24]. A key issue in this context, which has yet to be explored, is the study of 2D chiral metamaterials with numerous chiral parameters via a highly-accurate and significantly-fast approach, as the simple cases of optical chirality in 2D chiral metamaterials with fixed dimensions have been previously addressed [3,25].
Recent trends in artificial intelligence have led to a proliferation of studies on deep learning (DL) algorithm and its utilization in diverse fields, such as biology [26][27][28], chemistry [29][30][31], and physics [32][33][34]. In particular, one important aspect pertaining to DL is its capability of characterizing and predicting the physical properties for photonic structures [35][36][37], including the reconstruction of ultrashort pulses [38,39], the wave-front sensing [40], and the design of metasurfaces [41], chiral metamaterials [42,43] and electromagnetic nanostructures [44][45][46][47]. Furthermore, the DL scheme has also penetrated into computational physics, covering the areas of estimating stress distribution [48], assisting computational mechanics [49], capturing nonlinear material behaviors [50] and predicting plasmonic colors [51], whose main advantages over the conventional finite element method are that it not only speeds up the investigation process, but also creates many nonintuitive designs with distinguished performance. It is worth noticing that DL model is a supervised form in machine learning (ML) that usually adopts backpropagation to train its network [52]. In the above studies, DL can perform end-to-end learning [53], extract features automatically [54], discover hidden features [55] and improve model accuracy [56]. The category of DL algorithms includes the deep neural network (DNN) containing fully-connected layer [57], recurrent neural network commonly used in contextual data [58], convolutional neural network most used in image recognition [59], stacked autoencoder frequently used in feature mining [60], and generative adversarial network regularly used in sample generation [61].
In this work, a DL network based on the DNN algorithm, namely the fully connected neural network (FC-NN), is proposed to automatically study and predict the chiroptical response of 2D diffractive chiral metamaterials with various geometry parameters. In the metamaterials, a gold array of left-handed T-like shaped molecules is fabricated and then deposited on an oxidized silicon (Si) substrate. To study in detail the influence of the chiral parameters (unit period, width, bridge length, and separation length) on the CD characteristics of the T-like metamaterials, both the rigorous coupled wave analysis (RCWA) method and the FC-NN approach are employed. In particular, after the CD spectra for 7358 intermediate geometries are calculated via the RCWA method, the FC-NN network is well trained and capable of predicting the chiroptical response of T-like metamaterials with different chiral parameters. Our work reveals that the T-like chiral metamaterials show the strongest CD performance in the third-order diffracted beams when considering up to four diffraction orders, although the scattered intensities of the third-order beams are far smaller than those of the second-order beams. Furthermore, the bisignate feature of the CD spectra, i.e., the presence of both positive and negative signs of the CD response, varies nonlinearly with the chiral parameters including the unit period, width, bridge length, and separation length, opening up new possibilities for the engineering of optical chirality. More importantly, the FC-NN network is confirmed to be a promising and powerful technique that can characterize the chiroptical response of diffractive chiral nanostructures of different geometries with high accuracy and ultrafast computational speed. This study sets out to assess the feasibility of DL networks in the characterization and engineering of 2D chiral metamaterials for next-generation hyper-sensitive detecting nano-devices.

Computational methods
The intensities of the higher-order diffracted beams for the T-like chiral metamaterials under LCP and RCP light irradiation are numerically calculated via the RCWA method implemented in Synopsys RSoft DiffractMOD. Notably, the wavelength-dependent permittivities of Au, Cr, SiO2, and Si are fully incorporated in the RCWA calculations. The four geometric parameters, namely the gold length, gold width, bridge length, and separation length, are uniformly distributed within the selected ranges. The incident light covers the wavelength region of 0.2-1.775 μm, which is discretized uniformly into 64 points. The CD response prediction of the chiral metamaterial can then be cast as a regression problem, the purpose of which is to associate the 1 × 4 structure parameter vector with the 2 × 64 spectra vector. Using the RCWA method, we have gathered 7358 samples, of which 5886 are used for training and the remaining 1472 for testing. The model is built with the open-source DL framework TensorFlow.
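The data layout described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the wavelength grid is taken to include both endpoints (which gives a 0.025 μm step), and the names `wavelength_grid` and `split_indices`, the seed, and the use of `random.Random` are hypothetical choices, not the authors' pipeline.

```python
import random

N_SAMPLES = 7358          # number of RCWA-simulated geometries (from the text)
N_WAVELENGTHS = 64        # spectral points per polarization (from the text)
LAMBDA_MIN_UM, LAMBDA_MAX_UM = 0.2, 1.775


def wavelength_grid(n=N_WAVELENGTHS, lo=LAMBDA_MIN_UM, hi=LAMBDA_MAX_UM):
    """Uniform grid of n wavelengths in micrometres (assumes both endpoints are included)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def split_indices(n_samples=N_SAMPLES, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them 4:1 into train and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # the seed is an arbitrary, hypothetical choice
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


grid = wavelength_grid()
train_idx, test_idx = split_indices()
print(len(grid), round(grid[1] - grid[0], 3))
print(len(train_idx), len(test_idx))
```

With a training fraction of 0.8, truncating 0.8 × 7358 gives the 5886/1472 split quoted in the text.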

2D chiral metamaterials samples
The schematic illustration and optical properties of the T-like chiral metamaterials are shown in Figure 1. To begin with, the schematic of the T-like metamaterials under circularly polarized light excitation is presented in Figure 1(a), in which the higher-order diffraction beams are observed. Next, the unit cell of the T-like metamaterials is shown in Figure 1(b), where its dimensions are illustrated: the T-like sample has a gold length of l, a gold width of w, a gold bridge length of l_s, and a separation length between two adjacent nanoparticles of g, resulting in a unit period of a = 2l + 2g. It is important to mention that the other three geometric parameters (i.e., w, l_s, g) are expressed in terms of the gold length l in what follows, so that the unit period a is a fixed proportion of l. This reduces the investigation of the influence of the unit period on the CD performance to the dependence of CD on the gold length. The depth profile of the T-like metamaterials is as follows: the 30 nm gold arrays are fabricated and then transferred to an oxidized Si substrate whose SiO2 layer is 200 nm thick. Additionally, a 10 nm Cr film is inserted between the gold layer and the SiO2 layer, operating as the spacer. It is worth noting that the left-handed T-like metamaterials possess LSP resonances, which can dramatically enhance the optical chirality of the metamaterials. Figure 1(c) shows the normalized intensities of the n = 1-4 diffracted beams under LCP light irradiation. Notably, all the traditional numerical simulations in this work are conducted via the RCWA approach. It is apparent from this figure that the second-order diffraction beam exhibits the largest normalized intensity, whereas the third-order beam exhibits the smallest. The corresponding CD responses for up to four diffraction orders are presented in Figure 1(d).
Surprisingly, the third-order diffraction beam exhibits the strongest CD response, even though this diffraction order has the weakest normalized intensity. Thus, it is foreseeable that the CD characteristics can be engineered with additional flexibility by changing the chirality parameters of the T-like metamaterials. We introduce two crucial concepts in this context: (1) the spatial distribution of the diffraction order beams, which satisfies the grating diffraction equation a sin θ = nλ, where θ is the diffraction angle and n is the diffraction order [3]; (2) the normalized CD, defined as CD = (I_RCP − I_LCP)/(I_LCP + I_RCP), which represents the relative difference between the intensities of the third-order diffracted beams under LCP (I_LCP) and RCP (I_RCP) irradiation.
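The two definitions above translate directly into code. The following is a minimal sketch; the function names are hypothetical, and the handling of the case nλ > a (no propagating order, so no real diffraction angle) is an assumption added for robustness rather than something stated in the text.

```python
import math


def diffraction_angle_deg(period_um, order, wavelength_um):
    """Solve a*sin(theta) = n*lambda for theta in degrees.

    Returns None when n*lambda > a, i.e. when the order does not propagate
    (this cutoff handling is an assumption of the sketch).
    """
    s = order * wavelength_um / period_um
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


def normalized_cd(i_rcp, i_lcp):
    """Normalized CD = (I_RCP - I_LCP) / (I_LCP + I_RCP)."""
    total = i_lcp + i_rcp
    if total == 0:
        raise ValueError("total diffracted intensity is zero; CD is undefined")
    return (i_rcp - i_lcp) / total


# Example: l = 1.5 um and g = 0.2l give a = 2l + 2g = 3.6 um.
period = 2 * 1.5 + 2 * 0.2 * 1.5
theta = diffraction_angle_deg(period, order=3, wavelength_um=0.8)
cutoff = diffraction_angle_deg(period, order=3, wavelength_um=1.3)
cd = normalized_cd(i_rcp=0.3, i_lcp=0.1)
print(theta, cutoff, cd)
```

By construction the normalized CD lies between −1 and 1, which is why a beam with very small absolute intensity can still show a large CD.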

Deep learning network
The FC-NN model is constructed to study the correlation between the four chiral parameters of the T-like metamaterials and their chiroptical response. Firstly, we collect 7358 pairs of LCP/RCP spectra obtained via the RCWA method as the dataset. Then we pre-process the dataset and shuffle it randomly. More details about these first two steps are provided in the Supplementary Material (SM) section. Finally, we randomly divide the dataset into a training dataset and a test dataset with a ratio of 4:1. The schematic illustration of the FC-NN is shown in Figure 2(a); the network consists of six layers, namely an input layer, an output layer and four hidden layers. Specifically, the gold length, gold width, bridge length, and separation length of the metamaterials are the input data fed into the input layer, corresponding to the four neurons in the left green box. The output layer contains 128 neurons, with the first 64 neurons representing the predicted LCP spectrum and the other 64 representing the predicted RCP spectrum (see the right green box). In this work, we denote the number of neurons in the ith hidden layer as N_i, where i = 1, 2, 3 or 4. Notably, N_1, N_2, N_3 and N_4 (N_i ∈ {128, 256, 512, 1024, 1536, 2048, 3072, 4096}, i = 1, 2, 3, 4) are treated as hyperparameters. An activation layer with a leaky ReLU activation function (alpha = 0.2) is inserted between every two adjacent hidden layers and behind the output layer. The mean absolute error (MAE) is used as the loss function for our regression task, which is mathematically expressed as

$$\mathrm{loss} = \frac{1}{m}\sum_{i=1}^{m}\left|\mathrm{spec}_i^{\mathrm{pred}} - \mathrm{spec}_i^{\mathrm{real}}\right| \qquad (1)$$

where $\mathrm{spec}_i^{\mathrm{pred}}$ and $\mathrm{spec}_i^{\mathrm{real}}$ are the spectra predicted by the neural network and the labeling spectra simulated by RCWA for outcome i, respectively, and m = 16 is the batch size.
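To make the data flow through the network concrete, here is a small pure-Python sketch of a fully connected forward pass with leaky ReLU (alpha = 0.2) and an MAE loss. It is not the authors' TensorFlow model. The hidden widths are shrunk to 8 neurons so the example runs instantly, the weights are random, the activation is applied after every layer for simplicity, and the MAE is additionally averaged over the 128 spectral components; all of these are assumptions of the sketch, as are the function names.

```python
import random


def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: identity for x >= 0, slope alpha for x < 0."""
    return x if x >= 0 else alpha * x


def dense(x, weights, biases):
    """One fully connected layer: y_j = sum_k W[j][k] * x[k] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers, alpha=0.2):
    """Apply dense + leaky ReLU for every layer (a simplification of the text)."""
    for weights, biases in layers:
        x = [leaky_relu(v, alpha) for v in dense(x, weights, biases)]
    return x


def mae_loss(pred_batch, real_batch):
    """MAE over a batch; each sample's error is averaged over its components."""
    if len(pred_batch) != len(real_batch):
        raise ValueError("batches must have the same size")
    total = 0.0
    for pred, real in zip(pred_batch, real_batch):
        total += sum(abs(p - r) for p, r in zip(pred, real)) / len(real)
    return total / len(real_batch)


rng = random.Random(0)
sizes = [4, 8, 8, 8, 8, 128]        # 4 inputs, four (shrunken) hidden layers, 128 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Hypothetical input: gold length, gold width, bridge length, separation length (um).
output = forward([1.5, 0.3, 0.75, 0.3], layers)
lcp_pred, rcp_pred = output[:64], output[64:]   # first 64 = LCP, last 64 = RCP
print(len(lcp_pred), len(rcp_pred))
```

Slicing the 128-element output into two halves mirrors the convention in the text that the first 64 neurons hold the LCP spectrum and the remaining 64 hold the RCP spectrum.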
Before proceeding to the training process of the FC-NN network, it is important to stress that the four hyperparameters, N_1, N_2, N_3 and N_4, are optimized through the Genetic Algorithm (GA) [62], whose flow diagram is depicted in Figure 2(b). The construction of the GA is presented in detail in the SM section. These four parameters are treated as the chromosomes of the population, which means that each individual in the population has four chromosomes. The gene length of each chromosome is set to three, since a chromosome has eight different possible values. To balance global optimization against computational complexity, the population size is selected to be 30. The fitness value of the population is defined in terms of loss_10, where loss is defined in Equation (1) and the subscript '10' indicates that the loss is obtained after the model undergoes 10 epochs of training. Under the optimization of the GA, the optimal values of N_1, N_2, N_3 and N_4 are found to be 512, 1024, 2048, and 1024, respectively. During the running of the GA, we find that the FC-NN tends to take more time to reach a low MAE when N_i is smaller. A larger N_i, which usually makes the neural network more complex, is more likely to cause overfitting, although less computational time is required. Here, the TensorFlow framework is utilized to build the regression network. The training, GA optimization and prediction processes were performed using a graphics card (NVIDIA GeForce RTX 2080 Ti) with CUDA 10.0 on the Windows 10 OS.
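The encoding described above (four chromosomes, three genes each, eight candidate widths) can be illustrated as follows. The sketch assumes a plain binary encoding in which each 3-bit gene string is read as an index into the ordered candidate list; the source does not state the exact mapping, and the helper names, the seed, and the parameter-count utility are hypothetical additions. Selection, crossover, mutation and the fitness evaluation are deliberately left out, since their details are given only in the SM.

```python
import random

CANDIDATE_WIDTHS = [128, 256, 512, 1024, 1536, 2048, 3072, 4096]
GENE_LENGTH = 3        # 2**3 = 8 possible values per chromosome
N_CHROMOSOMES = 4      # one chromosome per hidden layer (N_1 ... N_4)


def decode(individual):
    """Map a 12-bit individual to hidden-layer widths (binary index is an assumption)."""
    widths = []
    for k in range(N_CHROMOSOMES):
        bits = individual[k * GENE_LENGTH:(k + 1) * GENE_LENGTH]
        index = int("".join(str(b) for b in bits), 2)
        widths.append(CANDIDATE_WIDTHS[index])
    return widths


def encode(widths):
    """Inverse of decode: widths -> flat list of bits."""
    bits = []
    for width in widths:
        index = CANDIDATE_WIDTHS.index(width)
        bits.extend(int(c) for c in format(index, "03b"))
    return bits


def random_population(size=30, seed=0):
    """Initial population of random bit strings (population size 30, as in the text)."""
    rng = random.Random(seed)
    n_bits = GENE_LENGTH * N_CHROMOSOMES
    return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(size)]


def parameter_count(widths, n_in=4, n_out=128):
    """Trainable weights and biases of a fully connected net with these hidden widths."""
    sizes = [n_in] + list(widths) + [n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


population = random_population()
best = [512, 1024, 2048, 1024]       # optimum reported in the text
print(encode(best))
print(decode(encode(best)))
print(parameter_count(best))
```

Under this encoding, the reported optimum corresponds to roughly 4.9 million trainable parameters, which gives a feel for the model size that the GA settles on.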

Estimation of FC-NN model
After the construction of the FC-NN model, we evaluate the performance of this network in predicting the chiroptical response of 2D chiral metamaterials, with the results shown in Figure 3. Firstly, the CD properties of T-like nanostructures with different gold lengths (1-2 μm) and bridge lengths (0.4-0.8l) are calculated using both the RCWA and FC-NN methods, as shown in Figures 3(a)-(i). It is clearly seen that the spectra predicted by the DL model coincide well with the RCWA simulated results (i.e., the labeling spectra) in all cases. Additionally, the LCP/RCP excited resonant wavelength increases with the unit period (or gold length), which is determined by the selective excitation of available modes. Furthermore, these left-handed T-like metamaterials can be excited by either LCP or RCP light. In particular, the modes excited by LCP light dominate in Figure 3(i), whereas in the other panels the modes excited by RCP light play the key role. Thus, it can be concluded that the geometry or incident wavelength, rather than the structure handedness, is responsible for the above phenomenon. This provides excellent potential for engineering the optical chirality at a specific electromagnetic mode, enabling the application of chiral metamaterials in sensitive chiroptical detectors. Notably, large normalized CD responses are not necessarily accompanied by large diffracted intensities under LCP/RCP irradiation. For instance, with the gold length being 1.5 μm, the maximum CD of the T-like metamaterials is about 0.93, 0.9, and 0.84 at 0.95 μm, 0.73 μm, and 0.7 μm, respectively, whereas the resonance is located around 1.15 μm.
We use the Adam optimization algorithm [63] with an initial learning rate of 0.001 to train our FC-NN network. The learning rate decays to 0.0003, 0.0001, and 0.00003 at epochs 200, 500, and 1000, respectively. After a training process of 2000 epochs with a batch size of 16, our FC-NN network converges at an MAE of 0.02 for the training dataset, which slightly increases to 0.03 for the test dataset. Here, we use the mean absolute percentage error (MAPE) [64,65], whose definition is given in the SM section, and the Pearson product-moment correlation coefficient (R) between the labeling spectra and the predicted spectra to evaluate the accuracy of the FC-NN model, as shown in Figure 3(j) and (k). Importantly, predicted spectra with MAPE < 5% and R > 0.99 are usually assumed to be of high accuracy. The most striking observation from Figure 3(j) is that the portion with MAPE < 5% (red bar) is above 95%, i.e., it covers most of the test data. Meanwhile, as shown in Figure 3(k), the portion with R > 0.99 (see red bar) takes up nearly 97%, indicating a high correlation between the predicted spectra and the labeling spectra. In addition, we compare the FC-NN network with RCWA and other ML methods, including k-Nearest Neighbor (KNN) [66], Decision Tree [67], Random Forest [68], and the generalized regression neural network (GRNN) [69], to estimate their efficiency and accuracy. A brief description of these four ML methods is presented in the SM section. As presented in Table 1, although RCWA has an obvious advantage in MAPE, it is so time-consuming that it needs about 5.5 h to generate 100 samples. Although all the ML models perform well in terms of the minimum MAPE, our FC-NN network exhibits the best mean MAPE, 1.07%. Regarding the computational time, our proposed FC-NN only requires 1 ms to generate 100 samples, which is four orders of magnitude faster than RCWA and at least 2-fold faster than the other ML methods.
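The training schedule and the two accuracy metrics can be sketched as below. Several points are assumptions: the learning rate is treated as piecewise constant, with each new value taking effect exactly at the stated epoch; MAPE uses the standard textbook definition (the paper's own definition is in the SM and may differ), which is undefined when a reference value is zero; and all function names are hypothetical.

```python
import math

# (first epoch at which the rate applies, learning rate), as quoted in the text
LR_SCHEDULE = [(0, 1e-3), (200, 3e-4), (500, 1e-4), (1000, 3e-5)]


def learning_rate(epoch):
    """Piecewise-constant learning rate (boundary handling is an assumption)."""
    rate = LR_SCHEDULE[0][1]
    for start, value in LR_SCHEDULE:
        if epoch >= start:
            rate = value
    return rate


def mape(real, pred):
    """Mean absolute percentage error in percent (standard definition, assumed)."""
    return 100.0 * sum(abs((r - p) / r) for r, p in zip(real, pred)) / len(real)


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def is_high_accuracy(real, pred, mape_limit=5.0, r_limit=0.99):
    """Apply the MAPE < 5% and R > 0.99 criterion used in the text."""
    return mape(real, pred) < mape_limit and pearson_r(real, pred) > r_limit


real_spectrum = [0.10, 0.20, 0.40, 0.30]     # hypothetical labeling spectrum
pred_spectrum = [0.101, 0.198, 0.404, 0.297] # hypothetical prediction
print(learning_rate(0), learning_rate(200), learning_rate(1500))
print(is_high_accuracy(real_spectrum, pred_spectrum))
```

Using both metrics together is a reasonable design: MAPE penalizes pointwise deviations, while R checks that the overall spectral shape is reproduced.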
Thus, FC-NN proves to be the better choice in terms of both accuracy and computational time. To verify that the FC-NN has learned the underlying rules instead of memorizing the training spectra, we also use lookup tables (i.e., Lagrange interpolation) to make predictions for the CDs; this approach breaks down on our dataset, as shown by the comparison in Table 1.
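For readers unfamiliar with the baseline, a generic one-dimensional Lagrange interpolation looks like this. The source does not describe how the lookup table was constructed over the four-dimensional parameter space, so this sketch only shows the basic technique along a single parameter; the function name and the sample values are hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)   # i-th Lagrange basis polynomial
        total += yi * basis
    return total


# Hypothetical tabulated values at three gold lengths (um); here y = x**2,
# which a degree-2 interpolant reproduces exactly.
lengths = [1.0, 1.5, 2.0]
values = [v * v for v in lengths]
print(lagrange_interpolate(lengths, values, 1.25))
```

Such an interpolant passes exactly through the tabulated points but has no notion of the underlying physics, which is consistent with the observation that it generalizes poorly to spectra with sharp, nonlinearly shifting resonances.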

Prediction of CD response
By means of the FC-NN model, it is feasible to study the nonintuitive dependence of the CD response of chiral metamaterials on the incident wavelength and chiral parameters in an efficient and accurate manner. In order to quantify the contribution of the geometric parameters to the CD effect, we first investigate the influence of the unit period on the CD response. Importantly, changing the unit period of the 2D metamaterials alters the resonant wavelength and the diffraction angle following the simple relation a sin θ = nλ. More specifically, we consider T-like chiral metamaterials with a separation length of g = 0.2l, a bridge length of l_s = 0.5l, and four different widths (0.15l, 0.2l, 0.25l, 0.3l), whose period is a = 2.4l, and utilize the DL network to predict the CD response for different values of the wavelength and gold length l. These results are summarized in Figure 4. One significant finding is that different gold widths of the T-like nanostructures induce different CD behaviors. Specifically, much stronger CD responses are exhibited for w = 0.15l and w = 0.2l than for the other two widths. In addition, the bisignate CD feature appears for almost every l in these four cases, indicating the highly nonlinear dispersion of the chiroptical response with the gold length or unit period. In particular, for the case of w = 0.2l, although it comprises more red-dominant modes than Figure 4(a), its CD signals show multiple-bisignate characteristics. In addition, the spatial properties of the higher-order diffraction beams can be controlled and tailored by changing the unit period of the T-like metamaterials.
Figure 3: From left to right, the panels correspond to bridge lengths of l_s = 0.4l, l_s = 0.6l, and l_s = 0.8l. Here, w = 0.2l and g = 0.2l. (j) MAPE between the labeling spectra and the predicted spectra. The portion with MAPE < 5% is presented as dark red bars. (k) Pearson product-moment correlation coefficient, R, between the labeling spectra and the predicted spectra. The portion with R > 0.99 is shown as dark red bars.
We now consider the CD performance at different separation lengths for T-like nanostructures via the FC-NN network.
An important term, space ratio = g/l, is introduced here to characterize the separation length, allowing the unit period a to be written in terms of l. We investigate the CD spectra of T-like metamaterials with different space ratios and otherwise fixed parameters (l = 2 μm, l_s = 0.4l, w = 0.15l-0.3l), as presented in Figure 5. As expected, smaller separation lengths between two adjacent nanoparticles lead to stronger CD responses. This indicates that the optical chirality of the metamaterials originates from the chiral coupling of all individual molecules. Moreover, the bisignate feature is found for most space ratios in all cases, implying that this behavior is not determined by the separation length.
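Because every dimension is expressed relative to the gold length l, a parameter sweep such as the one in Figure 5 reduces to choosing l and three ratios. The helper below is a hypothetical convenience function (not from the paper) that converts those ratios into absolute dimensions and the unit period a = 2l + 2g = 2l(1 + space ratio); the specific space-ratio values swept here are illustrative only.

```python
def t_like_geometry(l_um, w_ratio, ls_ratio, space_ratio):
    """Absolute dimensions (um) of a T-like unit cell from ratios relative to l."""
    g_um = space_ratio * l_um
    return {
        "l": l_um,                    # gold length
        "w": w_ratio * l_um,          # gold width
        "l_s": ls_ratio * l_um,       # bridge length
        "g": g_um,                    # separation length
        "a": 2 * l_um + 2 * g_um,     # unit period, a = 2l + 2g
    }


def as_input_vector(geometry):
    """Order the parameters as the 1 x 4 network input: l, w, l_s, g."""
    return [geometry["l"], geometry["w"], geometry["l_s"], geometry["g"]]


# Sweep of space ratios with l = 2 um, l_s = 0.4l and w = 0.15l (values illustrative).
sweep = [t_like_geometry(2.0, 0.15, 0.4, ratio) for ratio in (0.2, 0.4, 0.6)]
for geom in sweep:
    print(round(geom["g"], 3), round(geom["a"], 3), as_input_vector(geom))
```

Note that increasing the space ratio at fixed l also increases the unit period, so the separation length and the diffraction geometry cannot be varied fully independently.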
Surprisingly, the large positive CD responses seem to shift to the parameter space with smaller space ratios and λ, when increasing the width w. This may be explained by the fact that T-like metamaterials with larger widths support different electromagnetic modes when compared with the case of small width, considering that the chiroptical response in third-order diffracted beams is determined by the superposition of all excited electromagnetic modes. Also, there are more large negative values of CDs in parameter space with larger space ratios and λ, indicating that the modes irradiated by LCP dominate under these conditions.
Another key parameter that influences the CDs in the third-order diffracted beams is the bridge length, as it partially determines the shape of the T-like nanostructures. Therefore, using the FC-NN algorithm, we investigate the dependence of the CD responses on the normalized l_s, defined as the ratio between the bridge length l_s and the gold length l, under the conditions of l = 1.6 μm, g = 0.2l and four different w, as shown in Figure 6. It is seen that the T-like metamaterials with w = 0.2l (see Figure 6(b)) present the maximum CDs at each normalized l_s, accompanied by the most complicated bisignate characteristics. Additionally, negative CD responses seem to dominate over the whole parameter space for a gold width of w = 0.15l, whereas the opposite is true for w = 0.25l. The positive and negative CD values in Figure 6(d) suggest that the modes in T-like metamaterials at w = 0.3l can be excited by both LCP and RCP light. Furthermore, it can be concluded that the CD responses vary nonlinearly with the normalized l_s in all cases. Here, the unit period is fixed at a = 2.4l (l = 1.6 μm), which indicates that the wavelength of resonances between the incident LCP/RCP light and the electromagnetic [...]
Figure 6: Panels (a)-(d) correspond to the T-like structure with four widths of w = 0.15l, w = 0.2l, w = 0.25l, and w = 0.3l, respectively. Here, the length of a separate gold nanoparticle is l = 1.6 μm, and the separation length between two adjacent nanoparticles is fixed at g = 0.2l.
[...] Figure 7. In this analysis, we consider T-like nanostructures with four different w and a separation length of g = 0.3l, under the excitation of circularly polarized light at λ = 0.8 μm. One conclusion derived from these contour CD maps predicted by the FC-NN model is that the CD responses exhibit a nonlinear variation with the gold length l. Additionally, the dependence of the CD response on the normalized l_s is also not linear. Precisely, Figure 7 [...] by LCP light dominating in the parameter area of (l = 1.3-1.5 μm, normalized l_s = 0.6-1.0), leaving the RCP modes dominant in the rest of the space. However, the influence of the LCP modes on the CD responses decreases dramatically when the gold width of the T-like metamaterials becomes larger. Especially for the case of w = 0.2l (see Figure 7(b)), negative values of the CD are only observed in small regions, such as the area of (l = 1.3-1.4 μm, normalized l_s = 0.48-0.52), indicating a significant degradation of the bisignate feature at λ = 0.8 μm. On the other hand, negative CD values occur in the bottom region of Figure 7(c) and Figure 7(d), which may suggest that stronger LCP modes exist under these circumstances.
Since both the gold length and the separation length between two adjacent nanoparticles determine the unit period of the T-like metamaterials, a pertinent question is how the CD responses are affected by changes in these two parameters. To answer this question, we consider T-like metamaterials with different l and g, irradiated by LCP and RCP light at a wavelength of λ = 0.8 μm, with the results shown in Figure 8. One interesting finding is that when the gold width w is increased, the large negative CD responses seem to gradually decrease. Moreover, in the case of w = 0.15l, negative values of the CD response appear only below a space ratio of 0.35, whereas for the other widths the negative CD responses extend beyond this limit and occur in the upper parameter spaces of Figure 8(b)-(d). A reasonable explanation for these phenomena is that the electromagnetic modes induced by LCP light are highly likely to be excited at these dimensions of the T-like metamaterials. Additionally, larger absolute values of CD are obtained when the space ratio is smaller than 0.5 in all four cases, because a larger separation length causes a weaker coupling effect between the gold nanoparticles. On the other hand, the bisignate feature of the CD performance is found for most l, exhibiting a nonlinear dependence on the gold length. In particular, at a space ratio of 0.2 in Figure 8(b), the strength of the CD response seems to increase with l. To conclude, this figure offers a clear picture of how the change in CD response is associated with the unit period, providing a new understanding of chiroptical response engineering in chiral metamaterials via the DL network.

Conclusion
In summary, we have proposed and utilized a DNN-based DL model to investigate the optical chirality of various 2D chiral metamaterials in the higher-order diffraction beams. Both the traditional RCWA method and the DL model are employed to characterize the CD responses of T-like metamaterials with different chirality parameters, with the former providing the training data for the latter. In particular, we have addressed the sophisticated nonlinear dependence of the CD responses on the unit period, width, bridge length, and separation length of the chiral metamaterials using the DL network. It should be stressed that our proposed DL model is capable of predicting and optimizing the CD responses of diffractive chiral molecules in an ultrafast, highly efficient, and highly accurate manner, which dramatically reduces the computational resources spent on numerically solving the electromagnetic equations of optical chirality and turns this task into a data-driven one. The findings reported here shed new light on the prospects of DL networks in accelerating the development of metamaterials and nanophotonic devices involving complicated light-matter interactions.